Duration Prediction in Mandarin TTS System

Authors

  • Qing GUO
  • Nobuyuki Katae
Abstract

This paper reports the methodology and results of decision-tree-based duration prediction for a Mandarin text-to-speech system developed by Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing the duration of finals, such as phrase boundary and phone context, are discussed in detail. Experiments indicate that the most important determinant of the duration of a final is whether the level of the right phrase boundary is below the prosodic word level. Furthermore, the degree of vowel lengthening at a phrase boundary may vary depending on the type of final. This paper also explains methods for objective evaluation of the duration prediction model. Lastly, prosody evaluation results indicate that the prosody generated by our prosody generation module is much better than that of two other popular Mandarin TTS systems.

Similar resources

Decision Tree based Duration Prediction in Mandarin TTS System

This paper reports the methodology and results of decision-tree-based duration prediction for a Mandarin text-to-speech system developed by Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing the duration of finals, such as phrase boundary and phone context, are discussed in detail. Experiments indicate that it is the most important dete...

Duration modeling and memory optimization in a Mandarin TTS system

Current speech synthesis efforts, both in research and in applications, are dominated by methods based on concatenation of spoken units. New progress in concatenative text-to-speech (TTS) technology can be made mainly in two directions: either by reducing the memory footprint so that the system can be integrated into embedded systems, or by improving the synthesized speech quality in terms of intelligi...

Pitch Prediction for Mandarin TTS with Mutual Prosodic Constraint

Most current pitch prediction methods for Mandarin TTS try to derive pitch contours from contextual information by assigning a group of weights. Without a good method for constraining prosody concatenation, the predicted pitch contours are not always stable, because prosody information and text information do not fully agree. The paper presents a new Mandarin pitch predictio...

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

This paper introduces a syllable-HMM-based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the syllable-HMM-based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results s...

Variable Speech Rate Mandarin Chinese Text-to-Speech System

This paper presents a Hidden Markov Model (HMM)-based variable speech rate Mandarin Chinese text-to-speech (TTS) system. In this system, the spectrum, fundamental frequency and state duration parameters are generated by a context-dependent HMM (CDHMM) whose model parameters are linearly interpolated from those of three CDHMMs trained on corpora at three different speech rates (SRs), i.e. fast, med...

Publication date: 2006